Abstract

We show that if $M$ is a DFA with $n$ states over an arbitrary alphabet and $L = L(M)$,
then the worst-case state complexity of $L^2$ is $n2^n - 2^{n-1}$. If, however, $M$ is a DFA
over a unary alphabet, then the worst-case state complexity of $L^k$ is $kn - k + 1$ for all
$k \geq 2$.



1 Introduction 

We are often interested in quantifying the complexity of a regular language L. One natural 
complexity measure for regular languages is the state complexity of L, that is, the number 
of states in the minimal deterministic finite automaton (DFA) that accepts L. Given an
operation on regular languages, we may also define the state complexity of that operation to
be the number of states that are both sufficient and necessary in the worst case for a DFA
to accept the resulting language.

Birget [1] gave exact results for the state complexities of the intersection and union
operations on regular languages. Yu, Zhuang, and Salomaa [10] studied other operations,
such as concatenation and Kleene star. For instance, Yu, Zhuang, and Salomaa proved that,
given DFAs $M_1$ and $M_2$ with $m$ and $n$ states respectively, there exists a DFA with $m2^n - 2^{n-1}$
states that accepts $L(M_1)L(M_2)$. Moreover, there exist $M_1$ and $M_2$ for which this bound is
optimal. Some more recent work on the state complexity of concatenation has been done
by Jiraskova as well as Jirasek, Jiraskova, and Szabari [5]. Birget's work [2] on the state
complexity of $\Sigma^* L$ may also be of interest.

We are interested here in the state complexity of the concatenation of a regular language
$L$ with itself, which we denote $L^2$. We show that the bounds of Yu, Zhuang, and Salomaa
for concatenation are also optimal for $L^2$. In other words, if $M$ is a DFA with $n$ states and
$L = L(M)$, then the worst-case state complexity of $L^2$ is $n2^n - 2^{n-1}$. This bound, however,
does not hold if we restrict ourselves to unary languages. Specifically, we show that if $M$ is
a DFA over a unary alphabet, then the worst-case state complexity of $L^k$ is $kn - k + 1$ for
all $k \geq 2$.

We first recall some basic definitions. For further details see [4]. A deterministic finite
automaton $M$ is a quintuple $M = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set of states; $\Sigma$ is a
finite alphabet; $\delta : Q \times \Sigma \to Q$ is the transition function, which we extend to $Q \times \Sigma^*$ in
the natural way; $q_0 \in Q$ is the start state; and $F \subseteq Q$ is the set of final states. A DFA
$M$ accepts a word $w \in \Sigma^*$ if $\delta(q_0, w) \in F$. The language accepted by $M$ is the set of all
$w \in \Sigma^*$ such that $\delta(q_0, w) \in F$; this language is denoted $L(M)$. We denote the language
$L(M)L(M)$ by $L^2(M)$. We may extend this notation to higher powers by the recursive
definition $L^k(M) = L^{k-1}(M)L(M)$ for $k > 2$.
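
The definitions above can be illustrated with a minimal Python sketch that stores a DFA as a quintuple and extends the transition function to words by iterating over symbols. The class name `DFA`, its field names, and the example automaton `even_ones` (binary words with an even number of 1s) are assumptions introduced for illustration only; they do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """A DFA (Q, Sigma, delta, q0, F); all names here are illustrative."""
    states: frozenset
    alphabet: frozenset
    delta: dict        # maps (state, symbol) -> state
    start: object
    final: frozenset

    def run(self, word):
        """Extended transition function: state reached from the start on `word`."""
        q = self.start
        for a in word:
            q = self.delta[(q, a)]
        return q

    def accepts(self, word):
        """A word is accepted when the state it reaches is final."""
        return self.run(word) in self.final


# Hypothetical example: binary words containing an even number of 1s.
even_ones = DFA(
    states=frozenset({0, 1}),
    alphabet=frozenset({"0", "1"}),
    delta={(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0},
    start=0,
    final=frozenset({0}),
)
```

With this representation, `even_ones.accepts(w)` tests membership of `w` in $L(M)$, and membership in $L^2(M)$ could be tested by trying every split of `w` into two accepted factors.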

2 State complexity of $L^2$ for binary alphabets

In this section we consider the state complexity of $L^2$ for languages $L$ over an alphabet of
size at least 2.

Theorem 1. For any integer $n \geq 3$, there exists a DFA $M$ with $n$ states such that the
minimal DFA accepting the language $L^2(M)$ has $n2^n - 2^{n-1}$ states.

Proof. That the minimal DFA for $L^2(M)$ has at most $n2^n - 2^{n-1}$ states follows from the
upper bound of Yu, Zhuang, and Salomaa for concatenation of regular languages mentioned
in the introduction. To show that $n2^n - 2^{n-1}$ states are also necessary in the worst case
we define a DFA $M = (Q, \Sigma, \delta, 0, F)$ (Figure 1), where $Q = \{0, \ldots, n-1\}$, $\Sigma = \{0, 1\}$,
$F = \{n-1\}$, and for any $i$, $0 \leq i \leq n-1$,

$$\delta(i, a) = \begin{cases} 0 & \text{if } a = 0 \text{ and } i = 1, \\ i & \text{if } a = 0 \text{ and } i \neq 1, \\ (i + 1) \bmod n & \text{if } a = 1. \end{cases}$$




Figure 1: The DFA M 

We will apply the construction of Yu, Zhuang, and Salomaa [10, Theorem 2.3] and show
that the resulting DFA for $L^2(M)$ is minimal (see [5] for another example of this approach).
Let $M' = (Q', \Sigma, \delta', (0, \emptyset), F')$, where

• $Q' = Q \times 2^Q - F \times 2^{Q - \{0\}}$;

• $F' = \{(i, R) \in Q' \mid R \cap F \neq \emptyset\}$; and

• $\delta'((i, R), a) = (\delta(i, a), R')$, for all $a \in \Sigma$, where

$$R' = \begin{cases} \delta(R, a) \cup \{0\} & \text{if } \delta(i, a) \in F, \\ \delta(R, a) & \text{otherwise.} \end{cases}$$

Then $L(M') = L^2(M)$ and $M'$ has $n2^n - 2^{n-1}$ states.
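
The construction of $M'$ can be checked experimentally for small $n$. The sketch below builds the reachable part of $M'$ by breadth-first search and counts its Myhill-Nerode classes by Moore partition refinement, a standard minimization technique that the paper itself does not use. It assumes that the witness DFA $M$ has the transitions suggested by the way the proof uses $T_{1 \to 0}$: symbol 0 sends state 1 to state 0 and fixes every other state, and symbol 1 sends $i$ to $(i + 1) \bmod n$. The function names `witness_dfa`, `square_dfa`, and `count_classes` are hypothetical.

```python
from collections import deque


def witness_dfa(n):
    """Assumed transitions of the n-state witness DFA M over {0, 1}, F = {n-1}."""
    delta = {}
    for i in range(n):
        delta[(i, 0)] = 0 if i == 1 else i   # symbol 0: send 1 to 0, fix the rest
        delta[(i, 1)] = (i + 1) % n          # symbol 1: advance around the cycle
    return delta, {n - 1}


def square_dfa(n):
    """Reachable part of M' for L(M)^2, explored from the start state (0, {})."""
    delta, final = witness_dfa(n)
    start = (0, frozenset())
    seen, trans, queue = {start}, {}, deque([start])
    while queue:
        i, R = queue.popleft()
        for a in (0, 1):
            j = delta[(i, a)]
            S = {delta[(r, a)] for r in R}
            if j in final:                   # first copy accepts: start second copy
                S.add(0)
            nxt = (j, frozenset(S))
            trans[((i, R), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    accepting = {s for s in seen if s[1] & final}
    return seen, trans, accepting


def count_classes(states, trans, accepting):
    """Number of pairwise inequivalent states, by Moore partition refinement."""
    block = {s: int(s in accepting) for s in states}
    while True:
        ids = {}
        new_block = {}
        for s in states:
            sig = (block[s], block[trans[(s, 0)]], block[trans[(s, 1)]])
            new_block[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):   # no block was split: stable
            return len(ids)
        block = new_block


for n in (3, 4, 5):
    states, trans, accepting = square_dfa(n)
    print(n, len(states), count_classes(states, trans, accepting),
          n * 2**n - 2**(n - 1))
```

For each $n$ the loop prints the number of reachable states, the number of equivalence classes, and the value of $n2^n - 2^{n-1}$, so the three can be compared directly.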

To show that $M'$ is minimal we will show (a) that all states of $M'$ are reachable, and (b)
that the states of $M'$ are pairwise inequivalent with respect to the Myhill-Nerode equivalence
relation [4].

To prove part (a) let $(i, R)$ be a state of $M'$, where $R = \{r_1, \ldots, r_k\}$. If $0 \in R$, assume
that $r_k = 0$ and $r_1 < \cdots < r_{k-1}$; otherwise, assume that $r_1 < \cdots < r_k$. For $j = 1, \ldots, k$,
define $s_j$ as follows:

$$s_j = \begin{cases} (r_j - 1) \bmod n & \text{if } j = 1, \\ (r_j - r_{j-1}) \bmod n & \text{otherwise.} \end{cases}$$

If $i = 0$, we see that

$$\delta'((0, \emptyset), 1^n (10)^{s_k} 1^n (10)^{s_{k-1}} \cdots 1^n (10)^{s_1}) = (0, R).$$

If $i > 0$, then let $R' = \{(r_1 - i) \bmod n, \ldots, (r_k - i) \bmod n\}$. Just as for $(0, R)$, we see that
$(0, R')$ is reachable. Moreover, if $i \in F$ then $0 \in R$ and $1 \in R'$. Hence, $\delta'((0, R'), 1^i) = (i, R)$,
as required.

To prove part (b) let $(i, R)$ and $(j, S)$ be distinct states of $M'$. We have two cases.

Case 1: $R \neq S$. Then there exists $r$ such that $r$ is in one of $R$ or $S$ (say $R$) but not both.
If $i \in F$, then $r \neq 0$. Hence $\delta'((i, R), 1^{n-1-r}) \in F'$ but $\delta'((j, S), 1^{n-1-r}) \notin F'$.

Case 2: $R = S$. Suppose $i < j$. Let $i' = n - j + i$. For $T \subseteq Q$, let $T_{1 \to 0}$ denote the set
$(T \setminus \{1\}) \cup \{0\}$. We have two subcases.

Case 2i: $((j + 1) \bmod n) \notin R$. Then $\delta'((i, R), 1^{n-j}) = (i', R')$ for some $R'$, and $\delta'((j, S), 1^{n-j}) = (0, S')$
for some $S'$, where $1 \notin R'$ and $1 \in S'$. We may now apply the argument of Case 1 to
the states $(i', R')$ and $(0, S')$.

Case 2ii: $((j + 1) \bmod n) \in R$. If $i' \neq 1$, then $\delta'((i, R), 1^{n-j}) = (i', R')$ for some $R'$,
$\delta'((i', R'), 0) = (i', R'_{1 \to 0})$, and $\delta'((i', R'_{1 \to 0}), 1^j) = (i, R'')$ for some $R''$, where
$((j + 1) \bmod n) \notin R''$. Similarly, $\delta'((j, S), 1^{n-j}) = (0, S')$ for some $S'$, $\delta'((0, S'), 0) = (0, S'_{1 \to 0})$, and
$\delta'((0, S'_{1 \to 0}), 1^j) = (j, S'')$ for some $S''$, where $((j + 1) \bmod n) \notin S''$. If $R'' \neq S''$, we apply
the argument of Case 1 to the states $(i, R'')$ and $(j, S'')$; otherwise, we apply the argument
of Case 2i.

If $i' = 1$, then since $i < j$, $i = 0$ and $j = n - 1$. We thus have $\delta'((0, R), 0) = (0, R_{1 \to 0})$, and
$\delta'((0, R_{1 \to 0}), 1) = (1, R')$ for some $R'$, where $2 \notin R'$. Similarly, $\delta'((n-1, S), 0) = (n-1, S_{1 \to 0})$,
and $\delta'((n-1, S_{1 \to 0}), 1) = (0, S')$ for some $S'$, where $2 \notin S'$. If $R' \neq S'$, we apply the argument
of Case 1 to the states $(0, S')$ and $(1, R')$; otherwise, we apply the argument of Case 2i. □
3 State complexity of $L^k$ for unary alphabets

In this section we show that the bound given in Theorem 1 does not hold if we restrict
ourselves to unary languages. We also give optimal bounds for the state complexity of
arbitrary powers $L^k$ of a regular language $L$.

It is easy to see that the transition graph of a connected unary DFA $M$ with $n$ states is
composed of a "tail" with $\mu \geq 0$ states and a "cycle" with $\lambda \geq 1$ states, where $n = \mu + \lambda$.
Following Chrobak [3], we therefore denote the size of $M$ by the pair $(\lambda, \mu)$.

Pighizzini and Shallit [8] give the following result regarding concatenation of unary DFAs.

Theorem 2 (Pighizzini and Shallit). Let $L_1, L_2$ be unary languages accepted by DFAs
of sizes $(\lambda_1, \mu_1)$, $(\lambda_2, \mu_2)$ respectively. Then there exists a DFA $M$ of size $(\lambda, \mu)$, where $\lambda = \mathrm{lcm}(\lambda_1, \lambda_2)$
and $\mu = \mu_1 + \mu_2 + \mathrm{lcm}(\lambda_1, \lambda_2) - 1$, such that $L(M) = L_1 L_2$.

From Theorem 2 we can derive the following upper bound for the state complexity of $L^k$.

Theorem 3. Let $L$ be a unary language accepted by a DFA with $n$ states. For all $k \geq 2$,
there exists a DFA $M$ with $kn - k + 1$ states such that $L(M) = L^k$.

Proof. We prove the following by induction on $k$: if $L$ is accepted by a DFA of size $(\lambda, \mu)$,
where $n = \mu + \lambda$, then for all $k \geq 2$, there exists a DFA $M$ of size $(\lambda, k\mu + (k-1)\lambda - k + 1)$
such that $L(M) = L^k$.

If $k = 2$, then an easy application of Theorem 2 with $L_1 = L_2 = L$ gives a DFA $M$ of
size $(\lambda, 2\mu + \lambda - 1)$ such that $L(M) = L^2$.

If $k > 2$, then write $L^k = L^{k-1}L$. By induction, $L^{k-1}$ is accepted by a DFA of size
$(\lambda, (k-1)\mu + (k-2)\lambda - k + 2)$. Applying Theorem 2 with $L_1 = L^{k-1}$ and $L_2 = L$ gives a
DFA $M$ of size $(\lambda, k\mu + (k-1)\lambda - k + 1)$ such that $L(M) = L^k$. The DFA $M$ thus has

\begin{align*}
\lambda + k\mu + (k-1)\lambda - k + 1 &= k\mu + k\lambda - k + 1 \\
&= k(\mu + \lambda) - k + 1 \\
&= kn - k + 1
\end{align*}

states, as required. □
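
The arithmetic of this induction can be replayed numerically. The following Python sketch applies the size formula of Theorem 2 repeatedly, $k - 1$ times in total, and compares the resulting number of states with $kn - k + 1$. The helper names `concat_size`, `power_size`, and `total_states` are hypothetical, and the sketch only manipulates the size pairs $(\lambda, \mu)$; it does not build the automata themselves.

```python
from math import gcd


def concat_size(size1, size2):
    """Size (lambda, mu) given by Theorem 2 for a DFA accepting L1 L2."""
    (lam1, mu1), (lam2, mu2) = size1, size2
    lam = lam1 * lam2 // gcd(lam1, lam2)     # lcm(lam1, lam2)
    return lam, mu1 + mu2 + lam - 1


def power_size(size, k):
    """Size obtained for L^k by applying Theorem 2 k - 1 times, as in the proof."""
    result = size
    for _ in range(k - 1):
        result = concat_size(result, size)
    return result


def total_states(size):
    """Number of states of a unary DFA of size (lambda, mu)."""
    lam, mu = size
    return lam + mu


# Compare with the closed form kn - k + 1 on a few small cases.
for lam, mu, k in [(1, 0, 2), (3, 2, 2), (3, 2, 5), (4, 0, 3)]:
    n = lam + mu
    print((lam, mu), k, total_states(power_size((lam, mu), k)), k * n - k + 1)
```

Because both arguments of each concatenation share the same cycle length $\lambda$, the least common multiple never grows, which is why the bound stays linear in $k$.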
The following theorem gives a matching lower bound for the state complexity of $L^k$.

Theorem 4. For any integers $n, k$, $n \geq 2$, $k \geq 2$, there exists a DFA $M$ with $n$ states over
a unary alphabet such that the minimal DFA accepting the language $L^k(M)$ has $kn - k + 1$
states.

Proof. We define a DFA $M = (Q, \Sigma, \delta, 0, F)$, where $Q = \{0, \ldots, n-1\}$, $\Sigma = \{0\}$, $F = \{n-1\}$,
and for any $i$, $0 \leq i \leq n-1$, $\delta(i, 0) = (i + 1) \bmod n$. The transition graph of $M$ is thus a directed
$n$-cycle. Furthermore, $L(M) = 0^{n-1}(0^n)^*$. Hence, $L^k(M) = (0^{n-1}(0^n)^*)^k = 0^{k(n-1)}(0^n)^*$. The
language $L^k(M)$ is accepted by the DFA $M' = (Q', \Sigma, \delta', 0, F')$, where $Q' = \{0, \ldots, kn-k\}$,
$F' = \{kn-k\}$, for any $i$, $0 \leq i < kn-k$, $\delta'(i, 0) = i + 1$, and $\delta'(kn-k, 0) = kn-k-n+1$.
The DFA $M'$ is minimal, since every unary accessible and co-accessible DFA with a single
final state is minimal. □
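
The lower-bound construction can also be checked on small instances. The Python sketch below works with word lengths only, since the alphabet is unary. It computes the lengths in $L^k(M)$ directly as sums of $k$ lengths from $L(M)$, simulates the DFA $M'$ from the proof, and counts the states of $M'$ that are pairwise distinguishable. The function names and the length bound are assumptions chosen for illustration. The distinguishability check relies on the standard fact that two inequivalent states of a DFA with $N$ states are separated by some word of length less than $N$.

```python
def power_lengths(n, k, bound):
    """Lengths m <= bound with 0^m in L(M)^k, where L(M) = {0^m : m mod n = n-1}."""
    base = {m for m in range(bound + 1) if m % n == n - 1}
    current = set(base)
    for _ in range(k - 1):
        current = {a + b for a in current for b in base if a + b <= bound}
    return current


def step(state, n, k):
    """Transition of M' on the single symbol 0."""
    last = k * n - k
    return state + 1 if state < last else last - n + 1


def accepted_lengths(n, k, bound):
    """Lengths m <= bound accepted by M', whose only final state is kn - k."""
    last = k * n - k
    state, accepted = 0, set()
    for m in range(bound + 1):
        if state == last:          # state reached after reading 0^m
            accepted.add(m)
        state = step(state, n, k)
    return accepted


def distinguishable_states(n, k):
    """Number of states of M' with pairwise different acceptance behaviour."""
    last = k * n - k
    signatures = set()
    for s in range(last + 1):
        sig, t = [], s
        for _ in range(last + 1):  # words of length < number of states suffice
            sig.append(t == last)
            t = step(t, n, k)
        signatures.add(tuple(sig))
    return len(signatures)


for n, k in [(2, 2), (3, 2), (3, 4), (5, 3)]:
    bound = 4 * k * n
    same = power_lengths(n, k, bound) == accepted_lengths(n, k, bound)
    print(n, k, same, distinguishable_states(n, k), k * n - k + 1)
```

For each pair $(n, k)$ the loop reports whether the two languages agree up to the chosen bound, followed by the number of distinguishable states of $M'$ and the value $kn - k + 1$.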






4 Further work 

It remains to investigate the worst-case state complexity of $L^3$, $L^4$, etc. for general alphabets.